We propose novel methods for predictive (sparse) PCA with spatially misaligned data. These methods identify principal component loading vectors that explain as much variability in the observed data as possible, while also ensuring the corresponding principal component scores can be predicted accurately by means of spatial statistics at locations where air pollution measurements are not available. This will make it possible to identify important mixtures of air pollutants and to quantify their health effects in cohort studies, where currently available methods cannot be used. We demonstrate the utility of predictive (sparse) PCA in simulated data and apply the approach to annual averages of particulate matter speciation data from national Environmental Protection Agency (EPA) regulatory monitors.